Enhanced version of AdaBoostM1 with J48 Tree learning method
Authors
Abstract
Machine learning focuses on the construction and study of systems that can learn from data. It is closely tied to the classification problem, which is what most machine learning algorithms are designed to solve. When a machine learning method is used by people with no special expertise in machine learning, it is important that the method be ‘robust’ in classification, in the sense that reasonable performance is obtained with minimal tuning for the problem at hand. Algorithms are evaluated based on how robustly they classify the given data. In this paper, we propose a quantifiable measure of ‘robustness’ and describe a particular learning method that is robust according to this measure in the context of the classification problem. We propose Adaptive Boosting (AdaBoostM1) with J48 (a C4.5 tree) as the base learner, tuning the weight threshold (P) and the number of iterations (I) of the boosting algorithm. To benchmark performance, we use a baseline classifier: AdaBoostM1 with Decision Stump as the base learner and no parameter tuning. By tuning these parameters and using J48 as the base learner, we reduce the overall average error rate ratio (errorC/errorNB) from 2.4 to 0.9 on the development data sets and from 2.1 to 1.2 on the evaluation data sets.
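The abstract does not spell out how the average error rate ratio is computed. One plausible reading, shown here as a minimal sketch, is the mean over data sets of errorC/errorNB, where errorC is the error of the classifier under test and errorNB is the error of a reference classifier. The function name, the per-data-set averaging, and the sample numbers are all assumptions made for illustration and are not taken from the paper.

```python
from statistics import mean


def average_error_ratio(classifier_errors, reference_errors):
    """Mean of the per-data-set ratios errorC / errorNB.

    Assumption: the "overall average" is the mean of per-data-set ratios,
    not the ratio of the two mean errors.
    """
    if len(classifier_errors) != len(reference_errors):
        raise ValueError("both error lists must cover the same data sets")
    if not classifier_errors:
        raise ValueError("at least one data set is required")

    ratios = []
    for error_c, error_nb in zip(classifier_errors, reference_errors):
        if error_nb <= 0:
            raise ValueError("reference error must be positive to form a ratio")
        ratios.append(error_c / error_nb)
    return mean(ratios)


# Hypothetical error rates on two data sets (illustrative values only).
example_ratio = average_error_ratio([0.10, 0.30], [0.20, 0.20])
```

Under this reading, a ratio below 1 means the classifier under test makes fewer errors on average than the reference classifier, and a ratio above 1 means it makes more.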
Similar resources
A Perspective Analysis of Traffic Accident using Data Mining Techniques
Data mining is the extraction of hidden patterns from huge databases. It is commonly used in marketing, surveillance, fraud detection, and scientific discovery. In data mining, machine learning is a main research focus: it automatically learns to recognize complex patterns and make intelligent decisions based on data. Nowadays traffic accidents are the major causes of death and injuries i...
Are Decision Trees Always Greener on the Open (Source) Side of the Fence?
This short paper compares the performance of three popular decision tree algorithms: C4.5, C5.0, and WEKA’s J48. These decision tree algorithms are all related in that C5.0 is an updated commercial version of C4.5 and J48 is an implementation of the C4.5 algorithm under the WEKA data mining platform. The purpose of this paper is to verify the explicit or implied performance claims for these alg...
Classification of data using New Enhanced Decision Tree Algorithm (NEDTA)
Abstract: Data mining is a method of maintaining a large amount of data stored in a database. A decision tree is a data mining technique that classifies the data and produces valuable results. These results are used in analysis and future prediction. The prime objective of this research work is to present...
Avoiding the Look-Ahead Pathology of Decision Tree Learning
Most decision-tree induction algorithms use a local greedy strategy, where a leaf is always split on the best attribute according to a given attribute selection criterion. A more accurate model could possibly be found by looking ahead for alternative subtrees. However, some researchers argue that the look-ahead should not be used due to a negative effect (called “decision tree pathology”)...
Fingerprint Gender Classification using Univariate Decision Tree (J48)
Data mining is the process of analyzing data from different categories. This data provides information, and data mining extracts new knowledge from it, creating new useful information. Decision tree learning is a method commonly used in data mining. The decision tree is a decision model that looks like a tree-like graph with nodes, branches and leaves. Each internal node denotes...
Journal title:
- CoRR
Volume abs/1802.03522, issue
Pages -
Publication date 2013